Description

BACKGROUND OF THE INVENTION 
Field of the invention

[0001] This invention relates to a signal processing apparatus or system that carries out signal processing using a so-called neural network made up of a plurality of units, each performing signal processing corresponding to that of a neuron, and to a learning processing apparatus or system that causes a signal processing section formed by said neural network to undergo learning processing in accordance with the back-propagation learning rule.

Prior art 

[0002] The back-propagation learning rule, a learning algorithm for neural networks, has been tentatively applied to signal processing, including high-speed image processing and pattern recognition, as disclosed in "Parallel Distributed Processing", vol. 1, The MIT Press, 1986, or "Nikkei Electronics", issue of August 10, 1987, No. 427, pp. 115 to 124. The back-propagation learning rule is also applied, as shown in Fig. 1, to a multilayer neural network having an intermediate layer 2 between an input layer 1 and an output layer 3.

[0003] Each unit u_j of the neural network shown in Fig. 1 issues an output value obtained by transforming, with a predetermined function f_j such as a sigmoid function, the total sum net_j of the output values O_i of the units u_i coupled to the unit u_j, each weighted by the coupling coefficient W_ji. That is, when the value of a pattern p is supplied as an input value to each unit u_i of the input layer 1, the output value O_pj of each unit u_j of the intermediate layer 2 and the output layer 3 is expressed by the following formula (1):

$$O_{pj} = f_j(\mathrm{net}_j) = f_j\Big(\sum_i W_{ji}\, O_{pi}\Big) \qquad (1)$$



[0004] The output value O_pj of each unit u_j of the output layer 3 may be obtained by sequentially computing the output values of the units u_j, each corresponding to a neuron, from the input layer 1 towards the output layer 3.
[0005] In accordance with the back-propagation learning algorithm, learning processing that modifies the coupling coefficients W_ji so as to minimize the total sum E_p of the square errors between the actual output value O_pj of each unit u_j of the output layer 3 on application of the pattern p and the desired output value t_pj, that is, the teacher signal,

$$E_p = \frac{1}{2}\sum_j (t_{pj} - O_{pj})^2 \qquad (2)$$

is sequentially performed from the output layer 3 towards the input layer 1. By such learning processing, the output value O_pj closest to the value t_pj of the teacher signal is output from the unit u_j of the output layer 3.
[0006] If the variant ΔW_ji of the coupling coefficient W_ji which minimizes the total sum E_p of the square errors is set so that

$$\Delta W_{ji} \propto -\frac{\partial E_p}{\partial W_{ji}} \qquad (3)$$

the formula (3) may be rewritten as

$$\Delta W_{ji} = \eta\, \delta_{pj}\, O_{pi} \qquad (4)$$

as explained in detail in the above reference materials.

[0007] In the above formula (4), η stands for the learning rate, a constant which may be determined empirically from the number of units or layers or from the input or output values, and δ_pj stands for the error proper to the unit u_j.
[0008] Therefore, in determining the above variant ΔW_ji, it suffices to compute the error δ_pj in the reverse direction, that is, from the output layer towards the input layer of the network.

[0009] The error δ_pj of the unit u_j of the output layer 3 is given by the formula (5)

$$\delta_{pj} = (t_{pj} - O_{pj})\, f'_j(\mathrm{net}_j) \qquad (5)$$

On the other hand, the error δ_pj of the unit u_j of the intermediate layer 2 may be computed by the recurrence formula (6)

$$\delta_{pj} = f'_j(\mathrm{net}_j) \sum_k \delta_{pk}\, W_{kj} \qquad (6)$$

using the error δ_pk and the coupling coefficient W_kj of each unit u_k coupled to the unit u_j, herein each unit of the output layer 3. The derivation of the above formulas (5) and (6) is explained in detail in the above reference materials.
[0010] In the above formulas, f'_j(net_j) stands for the derivative of the output function f_j(net_j).
[0011] Although the variant ΔW_ji may be found from the above formula (4), using the results of the formulas (5) and (6), more stable results may be obtained by finding it from the following formula (7)

$$\Delta W_{ji}(n+1) = \eta\, \delta_{pj}\, O_{pi} + \alpha\, \Delta W_{ji}(n) \qquad (7)$$

with the use of the results of the preceding learning. In the above formula, α stands for a stabilization factor for reducing the error oscillations and accelerating the convergence thereof.
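The conventional procedure of formulas (1) to (7) may be illustrated by the following minimal Python sketch. It assumes a sigmoid output function, for which f'(net) = O(1 - O), a single intermediate layer, and a constant bias input of 1.0 appended to each layer. The function names, the bias input and the default values of eta and alpha are illustrative assumptions and are not taken from the specification.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hid, w_out, pattern):
    # Formula (1), applied layer by layer. The trailing constant 1.0 acts as
    # a bias input; this is an assumption of the sketch.
    o_in = list(pattern) + [1.0]
    o_hid = [sigmoid(sum(w * o for w, o in zip(row, o_in))) for row in w_hid]
    o_hid.append(1.0)
    o_out = [sigmoid(sum(w * o for w, o in zip(row, o_hid))) for row in w_out]
    return o_in, o_hid, o_out


def train_step(w_hid, w_out, dw_hid, dw_out, pattern, target,
               eta=0.5, alpha=0.9):
    """One learning step on one pattern; returns E_p measured before the update."""
    o_in, o_hid, o_out = forward(w_hid, w_out, pattern)
    # Formula (5): error of each output unit.
    d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, o_out)]
    # Formula (6): error of each intermediate unit, using the output errors.
    d_hid = [o_hid[j] * (1.0 - o_hid[j])
             * sum(d_out[k] * w_out[k][j] for k in range(len(w_out)))
             for j in range(len(w_hid))]
    # Formula (7): new variant = eta * delta * input + alpha * previous variant.
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            dw_out[k][j] = eta * d_out[k] * o_hid[j] + alpha * dw_out[k][j]
            row[j] += dw_out[k][j]
    for j, row in enumerate(w_hid):
        for i in range(len(row)):
            dw_hid[j][i] = eta * d_hid[j] * o_in[i] + alpha * dw_hid[j][i]
            row[i] += dw_hid[j][i]
    # Formula (2): half the sum of the square errors for this pattern.
    return 0.5 * sum((t - o) ** 2 for t, o in zip(target, o_out))
```

The lists dw_hid and dw_out hold the previous variants ΔW_ji(n) and are initialized to zero before the first step; the errors of the intermediate layer are computed before any weight is modified, so that formula (6) uses the coefficients of the current iteration.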

[0012] The above described learning is repeated and is terminated at the time point when the total sum E_p of the square errors between the output value O_pj and the teacher signal t_pj becomes sufficiently small.
[0013] It is noted that, in the conventional signal processing system in which the aforementioned back-propagation learning rule is applied to the neural network, the learning constant is determined empirically from the numbers of the layers and of the units corresponding to neurons or from the input and output values, and the learning is carried out at a constant learning rate using the above formula (7). Thus the number of repetitions n of the learning required until the total sum E_p of the square errors between the output value O_pj and the teacher signal t_pj becomes small enough to terminate the learning may be so enormous as to render efficient learning unfeasible.

[0014] Also, the above described signal processing system is constructed as a network consisting only of feedforward couplings between the units corresponding to the neurons, so that, when the features of the input signal pattern are to be extracted by learning the coupling state of the above mentioned network from the input signals and the teacher signal, it is difficult to extract a sequential time-series or chronological pattern, such as that of audio signals fluctuating on the time axis.

[0015] In addition, while the learning processing of the above described multilayer neural network in accordance with the back-propagation learning rule has a promisingly high functional ability, it frequently occurs that the optimum global minimum is not reached, but only a local minimum is reached, in the course of the learning process, such that the total sum E_p of the square errors cannot be reduced sufficiently.

[0016] Conventionally, when such a local minimum is reached, the initial value or the learning rate η is changed and the learning processing is repeated until the optimum global minimum is found. This results in considerable fluctuation and protraction of the learning processing time.

[0017] The paper by Kung et al entitled "An Algebraic Projection Analysis for Optimal Hidden Units Size and Learning Rates in Back-Propagation Learning" (IEEE International Conference on Neural Networks, San Diego, California, July 24-27, 1988) seeks to optimize a learning process consisting of iterative application of the back-propagation algorithm and uses an approach whereby it is sought to minimise the value of a modified measure of mean squared error.
[0018] The paper by Koutsougeras et al entitled "Training of a Neural Network for Pattern Classification Based on an Entropy Measure" (IEEE International Conference on Neural Networks, San Diego, California, July 24-27, 1988) discusses a neural network having a branching structure designed to partition n-dimensional space into different discrete regions corresponding to respective different classes of patterns. The branching structure is incrementally built up during a training phase where input patterns are applied to the individual neurons of the network and threshold and weight values of the neurons are adjusted starting from the input layer and proceeding towards the output layer. Extra neurons are added as needed in order adequately to partition the n-dimensional space.
Objects of the invention

[0019] It is a primary object of the present invention to provide a learning processing system in which the signal processing section of the neural network is subjected to learning processing in accordance with a back-propagation learning rule, wherein the local minimum state in the course of the learning processing may be efficiently avoided so that the optimum global minimum state is realized quickly and stably.

Summary of the invention 

[0020] For accomplishing the primary object, the present invention provides a learning processing system in which the learning processing section executes the learning processing of the coupling strength coefficient while increasing the number of units of the intermediate layer.

[0021] The above and other objects and novel features of the present invention will become apparent from the following detailed description of the invention, which is made in conjunction with the accompanying drawings, and from the new matter pointed out in the claims.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0022] Fig. 1 is a diagrammatic view showing the general construction of a neural network to which the back-propagation learning rule is applied.

[0023] Fig. 2 is a block diagram schematically showing the construction of an illustrative example of a signal processing system.

[0024] Fig. 3 is a diagrammatic view of a neural network showing the construction of the signal processing section of the signal processing system shown in Fig. 2.
[0025] Fig. 4 is a flow chart showing the process of learning processing in the learning processing section constituting the signal processing system shown in Fig. 2.

[0026] Fig. 5 is a block diagram schematically showing the construction of the learning processing system according 
to the present invention. 

[0027] Figs. 6A and 6B are diagrammatic views showing the state of the signal processing section at the start and in the course of learning processing in the learning processing system shown in Fig. 5.

[0028] Fig. 7 is a flow chart showing a typical process of learning processing in the learning processing section 
constituting the learning processing system shown in Fig. 5. 

[0029] Fig. 8 is a chart showing the typical results of tests of learning processing on the signal processing section 
of the neural network shown in Fig. 5 by the learning processing section of the learning processing system. 
[0030] Fig. 9 is a chart showing the results of tests of learning on the signal processing section of the neural network shown in Fig. 3, with the number of units of the intermediate layer fixed at six.

[0031] Fig. 10 is a chart showing the results of tests of learning on the signal processing system of the neural network shown in Fig. 3, with the number of units of the intermediate layer fixed at three.

DETAILED DESCRIPTION OF THE EMBODIMENTS

[0032] By referring to the drawings, certain preferred embodiments of the present invention will be explained in more 
detail. 

[0033] An illustrative example of a signal processing system will be hereinafter explained.
[0034] As shown schematically in Fig. 2, the signal processing system of the present illustrative example includes a signal processing section 30 for obtaining the output value O_pj from the input signal patterns p and a learning processing section 40 for causing the signal processing section 30 to undergo learning so as to obtain the output value O_pj closest to the desired output value t_pj from the input signal patterns p.

[0035] The signal processing section 30 is formed, as shown in Fig. 3, by a neural network of a three-layer structure including at least an input layer L_I, an intermediate layer L_H and an output layer L_O. These layers L_I, L_H and L_O are constituted by units u_I1 to u_Ix, u_H1 to u_Hy and u_O1 to u_Oz, respectively, each corresponding to a neuron, where x, y and z stand for arbitrary numbers. Each of the units u_H1 to u_Hy and u_O1 to u_Oz of the intermediate layer L_H and the output layer L_O is provided with delay means and forms a recurrent network including a loop LP having its output O_j(t) as its own input by way of the delay means and a feedback FB having its output O_j(t) as an input to another unit.
[0036] In the signal processing section 30, with the input signal patterns p entered into each of the units u_I1 to u_Ix of the input layer L_I, the total sum net_j of the inputs to the units u_H1 to u_Hy of the intermediate layer L_H is given by the following formula (17):
$$\mathrm{net}_j = \sum_{i=1}^{x}\sum_{k=0}^{NI} W_{j,\,x\cdot k+i}\, O_{Ii}(t-k) + \sum_{i=1}^{y}\sum_{k=1}^{NH} W_{j,\,y\cdot k+i}\, O_{Hi}(t-k) + \sum_{i=1}^{z}\sum_{k=1}^{NO} W_{j,\,z\cdot k+i}\, O_{oi}(t-k) + \theta_j \qquad (17)$$

Each of the units u_H1 to u_Hy of the intermediate layer L_H issues, for the total sum net_j of the input signals, an output value O_Hj(t) represented by the sigmoid function of the following formula (18):

$$O_{Hj}(t) = \frac{1}{1 + e^{-\mathrm{net}_j}} \qquad (18)$$

[0037] The total sum net_j of the inputs to the units u_O1 to u_Oz of the output layer L_O is given by the following formula (19):

$$\mathrm{net}_j = \sum_{i=1}^{y}\sum_{k=0}^{NH} W_{j,\,y\cdot k+i}\, O_{Hi}(t-k) + \sum_{i=1}^{z}\sum_{k=1}^{NO} W_{j,\,z\cdot k+i}\, O_{oi}(t-k) \qquad (19)$$

while each of the units u_O1 to u_Oz of the output layer L_O issues, for the total sum net_j of the inputs, an output value O_oj(t) represented by the following formula (20):

$$O_{oj}(t) = \frac{1}{1 + e^{-\mathrm{net}_j}} \qquad (20)$$

where θ_j stands for a threshold value and NI, NH and NO stand for the numbers of the delay means provided in the layers L_I, L_H and L_O, respectively.
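The forward computation of a recurrent network with delay means, as described in paragraphs [0035] to [0037], might be sketched in Python as follows. The sketch assumes that the input layer is tapped at delays 0 to NI, that the feedback from the intermediate and output layers is tapped at delays of 1 and more, and that the output layer receives the intermediate-layer outputs at delays 0 to NH; it also stores the weights as separate three-dimensional lists indexed [unit][delay][source] rather than with the single running index of the formulas. All names, including make_network and step, are hypothetical.

```python
import math
from collections import deque


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def delayed_sum(weights, taps):
    # Sum over delays k and source units i of weights[k][i] * taps[k][i].
    return sum(w * o for w_k, o_k in zip(weights, taps)
               for w, o in zip(w_k, o_k))


def zeros(n_units, n_taps, n_src):
    return [[[0.0] * n_src for _ in range(n_taps)] for _ in range(n_units)]


def make_network(x, y, z, ni, nh, no):
    # All weights, thresholds and delayed outputs start at zero (assumption).
    return {
        "w_hi": zeros(y, ni + 1, x),   # input -> intermediate, delays 0..NI
        "w_hh": zeros(y, nh, y),       # intermediate -> intermediate, delays 1..NH
        "w_ho": zeros(y, no, z),       # output -> intermediate, delays 1..NO
        "theta": [0.0] * y,            # thresholds of the intermediate units
        "w_oh": zeros(z, nh + 1, y),   # intermediate -> output, delays 0..NH
        "w_oo": zeros(z, no, z),       # output -> output, delays 1..NO
        "hist_in": deque([[0.0] * x for _ in range(ni + 1)], maxlen=ni + 1),
        "hist_hid": deque([[0.0] * y for _ in range(nh)], maxlen=nh),
        "hist_out": deque([[0.0] * z for _ in range(no)], maxlen=no),
    }


def step(net, x_t):
    """Advance the network by one time step and return (O_H(t), O_o(t))."""
    net["hist_in"].appendleft(list(x_t))          # hist_in[k] holds O_I(t-k)
    o_hid = [sigmoid(delayed_sum(net["w_hi"][j], net["hist_in"])
                     + delayed_sum(net["w_hh"][j], net["hist_hid"])
                     + delayed_sum(net["w_ho"][j], net["hist_out"])
                     + net["theta"][j])
             for j in range(len(net["theta"]))]
    hid_taps = [o_hid] + list(net["hist_hid"])    # O_H(t-k) for k = 0..NH
    o_out = [sigmoid(delayed_sum(net["w_oh"][j], hid_taps)
                     + delayed_sum(net["w_oo"][j], net["hist_out"]))
             for j in range(len(net["w_oh"]))]
    net["hist_hid"].appendleft(o_hid)             # becomes O_H(t-1) next step
    net["hist_out"].appendleft(o_out)
    return o_hid, o_out
```

Because the delayed outputs are kept between calls, two successive calls to step with the same input generally produce different outputs once any feedback weight is non-zero, which is the property that allows patterns fluctuating along the time axis to be represented.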

[0038] The learning processing section 40 computes the coefficient W_ji of coupling strength between the units u_O1 to u_Oz, u_H1 to u_Hy and u_I1 to u_Ix, from the output layer L_O towards the input layer L_I, sequentially and repeatedly, according to the sequence shown in the flow chart of Fig. 4, and executes the learning processing of the coupling coefficient W_ji so that the total sum LMS of the square errors between the desired output value t_pj afforded as the teacher signal and the output value O_oj of the output layer L_O becomes sufficiently small. By such learning processing, the learning processing section 40 causes the output value O_oj of the output layer L_O to be closest to the desired output value t_zr, afforded as the teacher signal pattern, for an input signal pattern p_xr supplied to the signal processing section 30. This pattern p_xr represents an information unit as a whole which fluctuates along the time axis and is represented by xr data, where r stands for the number of times of sampling of the information unit and x for the number of data in each sample.

[0039] That is, the section 40 supplies at step 1 the input signal patterns p_xr to each of the units u_I1 to u_Ix of the input layer L_I, and proceeds to computing at step 2 each output value O_pj(t) of each of the units u_H1 to u_Hy and u_O1 to u_Oz of the intermediate layer L_H and the output layer L_O.

[0040] The section 40 then proceeds to computing at step 3 the error δ_pj of each of the units u_O1 to u_Oz and u_H1 to u_Hy, from the output layer L_O towards the input layer L_I, on the basis of the output values O_pj(t) and the desired output value t_zr afforded as the teacher signal.

[0041] In the computing step 3, the error δ_oj of each of the units u_O1 to u_Oz of the output layer L_O is given by the following formula (21):

$$\delta_{oj} = (t_{pj} - O_{oj})\, O_{oj}\,(1 - O_{oj}) \qquad (21)$$

while the error δ_Hj of each of the units u_H1 to u_Hy of the intermediate layer L_H is given by the following formula (22):

$$\delta_{Hj} = O_{Hj}\,(1 - O_{Hj}) \sum_k \delta_{ok}\, W_{kj} \qquad (22)$$

[0042] Then, in step 4, the learning variable β of the coefficient W_ji of coupling strength from the i'th one to the j'th one of the units u_I1 to u_Ix, u_H1 to u_Hy and u_O1 to u_Oz is computed by the following formula (23)

$$\beta = \frac{1}{\sum_i O_{pi}^2 + 1} \qquad (23)$$

in which the learning variable is represented by the reciprocal of the sum of the squares of the input values, to which 1 is added as a threshold value.

[0043] Then, in step 5, using the learning variable β computed in step 4, the variant ΔW_ji of the coupling coefficient W_ji from the i'th one to the j'th one of the units u_O1 to u_Oz, u_H1 to u_Hy and u_I1 to u_Ix is computed in accordance with the following formula (24):

$$\Delta W_{ji}(n) = \eta \cdot \beta\, (\delta_{pj}\, O_{pi}) \qquad (24)$$

In the formula, η stands for a learning constant.
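The normalization of steps 4 and 5 can be illustrated with a short Python sketch. It assumes that the sum in formula (23) runs over all input values O_pi reaching the unit j; the function names are hypothetical.

```python
def learning_variable(inputs):
    # Formula (23): reciprocal of (sum of squared input values + 1).
    return 1.0 / (sum(o * o for o in inputs) + 1.0)


def weight_changes(eta, delta_j, inputs):
    # Formula (24): variant of each coefficient W_ji feeding unit j.
    beta = learning_variable(inputs)
    return [eta * beta * delta_j * o for o in inputs]


# Worked example: inputs (1, 2, 2) give a square sum of 9, so beta = 1/10.
inputs = [1.0, 2.0, 2.0]
dw = weight_changes(eta=1.0, delta_j=0.5, inputs=inputs)
# Resulting change of net_j for the same inputs: eta * delta_j * 9/10.
shift = sum(d * o for d, o in zip(dw, inputs))
```

With this normalization the change of net_j caused by one update, for the same input values, equals η·δ_pj multiplied by S/(S+1), where S is the sum of the squared inputs. Its magnitude therefore stays below η·|δ_pj| however large the input values are, which is one way to read the statement that the learning rate changes dynamically as a function of the input values.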

[0044] Then, in step 5, the total sum LMS of the square errors on the units with respect to the teacher signal is computed in accordance with the formula (25):

$$\mathrm{LMS} = \sum_p \frac{1}{2} \sum_i (t_{pi} - O_{pi})^2 \qquad (25)$$



[0045] Then, in step 6, it is decided whether the processing of the steps 1 through 5 has been performed on the R-number of input signal patterns p_xr. If the result of decision at step 6 is NO, the section 40 reverts to step 1. When the result of decision at step 6 is YES, that is, when all of the variants ΔW_ji of the coupling coefficient between the units u_O1 to u_Oz, u_H1 to u_Hy and u_I1 to u_Ix have been computed for the input signal patterns p_xr, the section 40 proceeds to step 7 to decide the converging condition for the output value O_oj obtained at the output layer L_O on the basis of the total sum LMS of the square errors between the output value O_oj and the desired output value t_pj afforded as the teacher signal.

[0046] In the decision step 7, it is decided whether the output value O_oj obtained at the output layer L_O of the signal processing section 30 is closest to the desired output value t_pj afforded as the teacher signal. When the result of decision at step 7 is YES, that is, when the total sum LMS of the square errors is sufficiently small and the output value O_oj is closest to the desired output value t_pj, the learning processing is terminated. If the result of decision at step 7 is NO, the section 40 proceeds to computing at step 8.

[0047] In this computing step 8, the coupling coefficient W_ji between the units u_O1 to u_Oz, u_H1 to u_Hy and u_I1 to u_Ix is modified, on the basis of the variant ΔW_ji of the coupling coefficient W_ji computed at step 5, in accordance with the following formula (26)

$$\Delta W_{ji}(n) = \Delta W_{ji}(n) + \alpha\, \Delta W_{ji}(n-1) \qquad (26)$$

and the following formula (27)

$$W_{ji}(n+1) = W_{ji}(n) + \Delta W_{ji}(n) \qquad (27)$$

[0048] After the computing step 8, the section 40 reverts to step 1 to execute the operation of steps 1 to 6.
[0049] Thus the section 40 executes the operations of the steps 1 to 8 repeatedly and, when the total sum LMS of the square errors between the desired output value t_pj and the actual output value O_oj becomes sufficiently small and the output value O_oj obtained at the output layer L_O of the signal processing section 30 is closest to the desired output value t_pj afforded as the teacher signal, terminates the processing of learning by the decision at step 7.

[0050] In this manner, in the illustrative example signal processing system, the learning as to the coupling coefficient W_ji between the units u_O1 to u_Oz, u_H1 to u_Hy and u_I1 to u_Ix of the signal processing section 30 constituting the recurrent network inclusive of the above mentioned loop LP and the feedback FB is executed by the learning processing section 40 on the basis of the desired output value t_pj afforded as the teacher signal. Hence, the features of a sequential time-base input signal pattern p_xr fluctuating along the time axis, such as audio signals, may also be extracted reliably by the learning processing of the learning processing section 40. Thus, by setting the coupling state between the units u_O1 to u_Oz, u_H1 to u_Hy and u_I1 to u_Ix of the signal processing section 30 by the coupling coefficient W_ji obtained as the result of learning by the learning processing section 40, the time-series input signal pattern p_xr can be subjected to desired signal processing by the signal processing section 30.

[0051] Moreover, in the illustrative example system, the learning constant η is normalized by the learning variable β, represented by the reciprocal of the square sum of the input values at the units u_H1 to u_Hy and u_O1 to u_Oz, and the learning processing as to the coupling coefficient W_ji is performed at a dynamically changing learning rate, as a function of the input value O_pi, so that learning can be performed stably and expeditiously with a small number of times of learning.

[0052] In this manner, in the illustrative example signal processing system, signal processing for input signals is performed at the signal processing section 30 in which the recurrent network inclusive of the loop LP and the feedback FB is constituted by the units u_H1 to u_Hy and u_O1 to u_Oz of the intermediate layer L_H and the output layer L_O, each provided with delay means. In the learning processing section 40, the learning as to the coupling state of the recurrent network by the units u_H1 to u_Hy and u_O1 to u_Oz constituting the signal processing section 30 is executed on the basis of the teacher signal. Thus the features of sequential time-base patterns fluctuating along the time axis, such as audio signals, can be extracted by the above mentioned learning processing section to subject the signal processing section to the desired signal processing.

[0053] A preferred illustrative embodiment of the learning processing system according to the present invention will be hereinafter explained.

[0054] The basic construction of the learning processing system according to the present invention is shown in Fig. 5. As shown therein, the system includes a signal processing section 50 and a learning processing section 60. The signal processing section 50 is constituted by a neural network of a three-layered structure including at least an input layer L_I, an intermediate layer L_H and an output layer L_O, each made up of plural units performing signal processing corresponding to that of a neuron. The learning processing section 60 subjects the signal processing section 50 to learning processing which consists in sequentially and repeatedly computing the coefficient of coupling strength between the above units, from the output layer L_O towards the input layer L_I, on the basis of the error data δ_pj between the output value O_pj of the output layer L_O and the desired output value t_pj afforded as the teacher signal, for the input signal patterns p entered into the input layer L_I of the signal processing section 50, and in learning the coupling coefficient W_ji in accordance with the back-propagation learning rule.

[0055] The learning processing section 60 executes the learning processing of the coupling coefficient W_ji while causing the number of the units of the intermediate layer L_H of the signal processing section 50 to be increased, and thus the section 60 has the control function of causing the number of units of the intermediate layer L_H to be increased in the course of learning processing of the coupling coefficient W_ji. The learning processing section 60 subjects the signal processing section 50, having the input layer L_I, the intermediate layer L_H and the output layer L_O made up of arbitrary numbers x, y and z of units u_I1 to u_Ix, u_H1 to u_Hy and u_O1 to u_Oz, respectively, each corresponding to a neuron, as shown in Fig. 6A, to learning processing as to the coupling coefficient W_ji, while the section 60 causes the number of the units of the intermediate layer L_H to be increased sequentially from y to (y+m), as shown in Fig. 6B.

[0056] It is noted that the control operation of increasing the number of the units of the intermediate layer L_H may be performed periodically in the course of learning processing of the coupling coefficient W_ji, or each time the occurrence of the above mentioned local minimum state is sensed.

[0057] The above mentioned learning processing section 60, having the control function of increasing the number of the units of the intermediate layer L_H in the course of learning processing of the coupling coefficient W_ji, subjects the signal processing section 50, formed by a neural network of a three-layer structure including the input layer L_I, the intermediate layer L_H and the output layer L_O, to the learning processing of the coupling coefficient W_ji while causing the number of units of the intermediate layer L_H to be increased. Thus, even on occurrence of the local minimum state in the course of learning of the coupling coefficient W_ji, the number of units of the intermediate layer L_H of the section 50 can be increased so that the section 50 exits from such local minimum state and converges rapidly and reliably into the optimum global minimum state.
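The operation of increasing the intermediate layer from y to (y+m) units might be sketched in Python as follows, here for a plain feedforward weight layout without delay means. The specification does not state how the coefficients of the added units are initialized; the sketch assumes small random incoming coefficients and zero outgoing coefficients, so that the output of the network is unchanged at the instant the units are added. The function names and the scale parameter are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(w_hid, w_out, pattern):
    # w_hid[j][i]: input unit i -> intermediate unit j
    # w_out[k][j]: intermediate unit j -> output unit k
    o_hid = [sigmoid(sum(w * o for w, o in zip(row, pattern))) for row in w_hid]
    return [sigmoid(sum(w * o for w, o in zip(row, o_hid))) for row in w_out]


def add_hidden_units(w_hid, w_out, m, rng=None, scale=0.1):
    """Return copies of the weight tables with m extra intermediate units."""
    rng = rng or random.Random(0)
    n_in = len(w_hid[0])
    grown_hid = [row[:] for row in w_hid]
    # Incoming coefficients of the new units: small random values (assumption).
    grown_hid += [[rng.uniform(-scale, scale) for _ in range(n_in)]
                  for _ in range(m)]
    # Outgoing coefficients of the new units: zero (assumption), so the
    # existing mapping is preserved until learning modifies them.
    grown_out = [row[:] + [0.0] * m for row in w_out]
    return grown_hid, grown_out
```

Any table holding the previous variants ΔW_ji would have to be extended in the same way, with zeros for the new entries, before the learning processing is resumed with the enlarged layer.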

[0058] Tests were conducted repeatedly, in each of which the learning processing section 60, having the control function of increasing the number of units of the intermediate layer in the course of learning of the coupling coefficient W_ji, caused the signal processing section 50, constituting the recurrent network including the feedback FB and the loop LP of the illustrative example signal processing system of Figs. 2-4, to undergo the process of learning the coefficient W_ji. In these tests, the number of the units of the input layer L_I was 8 (x=8), that of the output layer L_O was 3 (z=3), the number of the delay means of each layer was 2, and 21 time-space patterns of l = 8×7 were used as the input signal patterns p during the learning operation. The processing algorithm shown in the flow chart of Fig. 7 was used, with the learning being started with the number of the units of the intermediate layer L_H at 3 (y=3) and with the number of the units of the intermediate layer L_H being increased during the learning process. By increasing the number of the units of the intermediate layer L_H three to five times, test results were obtained in which convergence to the optimum global minimum state was realized without going into the local minimum state.

[0059] Fig. 8 shows, as an example of the above tests, the test results in which learning processing converging into the optimum global minimum state could be achieved by adding units to the intermediate layer at the timing shown by the arrow mark in the figure and by increasing the number of the units of the intermediate layer L_H from three to six. The ordinate in Fig. 8 stands for the total sum LMS of the square errors and the abscissa stands for the number of times of the learning processing operations.

[0060] The processing algorithm shown in the flow chart of Fig. 7 is explained. 

[0061] In this processing algorithm, in step 1, the variable k indicating the number of times of the processing for detecting the local minimum state is initialized to 0, while the first variable Lms for deciding the converging condition of the learning processing is initialized to 10000000000.
[0062] Then, in step 2, the variable n indicating the number of times of learning of the overall learning pattern, that is, the l-number of the input signal patterns p, is initialized. The program then proceeds to step 3 to execute the learning processing of the l-number of the input signal patterns p.

[0063] Then, in step 4, a decision is made on the variable n indicating the number of times of learning. Unless n=3, the program proceeds to step 5 to add one to n (n → n+1), and then reverts to step 3 to repeat the learning processing. When n=3, the program proceeds to step 6.

[0064] In step 6, after the value of the first variable Lms is saved as the value of the second variable Lms(-1) for deciding the converging condition of the learning processing, the total sum of the square errors between the output signal and the teacher signal in each unit is computed in accordance with the formula (28), this value being then used as the new value for the first variable Lms, such that

$$\mathrm{Lms} = \sum_{p=1}^{l} \sum_{i=1}^{m} (t_{pi} - O_{pi})^2 \qquad (28)$$

[0065] Then, in step 7, the first variable Lms for deciding the converging condition of the learning processing is compared with the second variable Lms(-1). If the value of the first variable Lms is smaller than that of the second variable Lms(-1), the program proceeds to step 8 to decide whether or not the variable k indicating the number of times of the processing operations for detecting the local minimum state is equal to 0.
[0066] If, in step 8, the variable k is 0, the program reverts directly to step 2. If the variable k is not 0, k is decremented (k → k-1) in step 9. The program then reverts to step 2 to initialize n to 0 (n=0) and executes the learning processing of the l-number of the input signal patterns p in step 3.

[0067] If, in step 7, the value of the first variable Lms is larger than that of the second variable Lms(-1), the program proceeds to step 10 to increment the value of k indicating the number of times of the processing operations for detecting the local minimum state (k → k+1). Then, in step 11, it is decided whether or not the value of k is 2.

[0068] If, in step 11, the value of the variable k is not 2, the program reverts directly to step 2. If the variable k is 2, it is decided that the local minimum state is prevailing. Thus, in step 12, control is performed to increase the number of the units of the intermediate layer L_H. Then, in step 13, k is set to 0 (k=0). The program then reverts to step 2 for setting of n=0 and then proceeds to step 3 to execute the learning processing of the above mentioned l-number of the input signal patterns p.
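The control flow of Fig. 7, as described in paragraphs [0061] to [0068], may be summarized by the following Python sketch. The three callbacks, the max_checks limit and the tol threshold are hypothetical: the description of Fig. 7 does not state a termination condition, so the sketch assumes that learning stops after a fixed number of checks or once Lms falls to tol. The case in which Lms equals Lms(-1) is not covered by the description and is treated here as "not smaller".

```python
def train_with_growth(learn_all_patterns, total_error, grow_hidden_layer,
                      max_checks, tol=0.0):
    k = 0                     # step 1: counter for local-minimum detection
    lms = 1e10                # step 1: first variable Lms
    for _ in range(max_checks):
        for _n in range(4):   # steps 2 to 5: n runs from 0 to 3
            learn_all_patterns()          # step 3: learn the l patterns
        lms_prev = lms        # step 6: keep the old value as Lms(-1)
        lms = total_error()   # step 6: formula (28)
        if lms <= tol:        # assumed stopping rule, not part of Fig. 7
            break
        if lms < lms_prev:    # step 7: the error has decreased
            if k != 0:        # step 8
                k -= 1        # step 9
        else:                 # step 7: the error has not decreased
            k += 1            # step 10
            if k == 2:        # step 11: local minimum state assumed
                grow_hidden_layer()       # step 12
                k = 0         # step 13
    return lms
```

Under this reading, the number of units of the intermediate layer is increased only when the error fails to decrease at two checks without an intervening decrease, each check being made after four passes over the learning patterns.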

[0069] A test of the learning processing was conducted on the signal processing section 50 of the above described illustrative example signal processing system of Figs. 2-4, constituting the recurrent network including the feedback FB and the loop LP shown in Fig. 3, with the number of the units of the intermediate layer L_H being fixed at six (y=6). The test results revealed that the learning processing needed to be repeated an extremely large number of times, with considerable time expenditure, until convergence to the optimum global minimum state was achieved, and that the local minimum state prevailed in three out of eight learning processing tests without convergence to the optimum global minimum state.

[0070] Fig. 9 shows, by way of an example, the results of the learning processing tests in which the local minimum state was reached.

[0071] In this figure, the ordinate stands for the total sum LMS of the square errors and the abscissa stands for the number of times of the learning processing operations.

[0072] Also, tests of the learning processing were conducted 30 times on the signal processing section 50 of the above described illustrative example signal processing system constituting the recurrent network including the feedback FB and the loop LP shown in Fig. 3, with the number of the units of the intermediate layer L_H being fixed at three (y=3). It was found that, as shown for example in Fig. 10, the local minimum state was reached in all of the tests on learning processing without convergence to the optimum global minimum state.

[0073] In Fig. 10, the ordinate stands for the total sum LMS of the square errors and the abscissa stands for the number of times of the learning processing operations.

[0074] From the foregoing it is seen that the present invention provides a learning processing system in which the learning processing of the coefficient of coupling strength is performed while the number of the units of the intermediate layer is increased by the learning processing section, whereby the local minimum state in the learning processing conforming to the back-propagation learning rule is avoided and convergence to the optimum global minimum state is achieved promptly and reliably, so that stable learning processing is realized.



Claims 

1. A learning processing system (50, 60) comprising:

a signal processing section (50) composed of a multi-layer neural network having an input layer (L_I), a hidden layer (L_H) and an output layer (L_O), the layers being made up of units u_I1 to u_Ix, u_H1 to u_Hy and u_O1 to u_Oz, respectively, each unit corresponding to a neuron; and

a learning processing section (60) executing a learning process using a back-propagation learning algorithm,

the process consisting in sequentially modifying, from the output layer towards the input layer, the coupling coefficients W_ji of all units j in the hidden and in the output layer by a variant ΔW_ji so as to minimize the total sum of square errors between the actual output O_pj of unit j in the output layer (L_O) produced from an input signal pattern (p) and the desirable output value (teacher signal) for said unit j in the output layer (L_O),
whereby W_ji is the weight for the signal from the ith to the jth unit,
the learning processing section being fed with a desired output value t_pj as a teacher signal for the output value O_pj of the unit j in the output layer (L_O) for the input patterns p entered into the input layer (L_I),
the learning processing section (60) computing the error value for each unit in the output layer and in the hidden layer,
said learning process being executed repeatedly until the total sum (E) of the square error between the desired output afforded as the teacher signal and the output signal becomes sufficiently small;

characterised in that the learning processing section (60) comprises control means for increasing the number of units in the hidden layer (L_H), during the repeated execution of said learning process, either periodically or when a local minimum state of the signal processing system has been detected, said learning processing section (60) being adapted in subsequent repeated executions of the learning process to perform learning processing of the coefficients of coupling strength in respect of the increased number of units in the hidden layer.

2. The learning processing system of claim 1, wherein the control means of the learning processing section (60) is 
adapted to detect a local minimum state by comparing successive values of a first variable, LMS, where 

$$\mathrm{LMS} = \sum_{p=1}^{l} \sum_{j=1}^{m} (t_{pj} - O_{pj})^2$$


Claims (German text, translated) 

1. A learning processing system (50, 60) 

having a signal processing section (50) composed of a multi-layer neural network with an input layer (L_I), 
a hidden layer (L_H) and an output layer (L_O), the layers consisting of units u_I1 to u_Ix, u_H1 to u_Hy and 
u_O1 to u_Oz, respectively, and each unit corresponding to a neuron, 

and having a learning processing section (60) which executes a learning process using a back-propagation 
algorithm, the process consisting in modifying, from the output layer towards the input layer, the coupling 
coefficients W_ji of all units j in the hidden layer and in the output layer by a variant ΔW_ji such that the 
total sum of the square errors between the actual output value O_pj of the unit j in the output layer (L_O) 
produced by an input signal pattern (p) and the desired output value t_pj (teacher signal) for this unit j in 
the output layer (L_O) is minimized, W_ji being the weight for the signal from the i-th unit to the j-th unit, 

the learning processing section being supplied with a desired output value t_pj as a teacher signal for the 
output value O_pj of the unit j in the output layer (L_O) for the input patterns p entered into the input 
layer (L_I), 

the learning processing section (60) computing the error value for each unit in the output layer and in the 
hidden layer, 

and the learning process being executed repeatedly until the total sum (E) of the square error between the 
desired signal supplied as the teacher signal and the output signal becomes sufficiently small, 

characterised in that 

the learning processing section (60) contains a control device for increasing the number of units in the 
hidden layer (L_H) during the repeated execution of the learning process, either periodically or when a local 
minimum state of the signal processing system has been detected, the learning processing section (60) being 
designed so that, in the subsequent repeated executions of the learning process, it performs the learning 
processing of the coupling coefficients W_ji with respect to the increased number of units in the hidden layer. 

2. The learning processing system according to claim 1, in which the control device of the learning processing 
section (60) can detect the local minimum state by comparing successive values of a first variable LMS, where 

$$\mathrm{LMS} = \sum_{p} \sum_{j} (t_{pj} - O_{pj})^2$$



Claims (French text, translated) 

1. A learning processing system (50, 60) comprising: 

a signal processing section (50) made up of a multi-layer neural network having an input layer (L_I), an 
intermediate layer (L_H) and an output layer (L_O), the layers being formed respectively of units u_I1 to u_Ix, 
u_H1 to u_Hy and u_O1 to u_Oz, each unit corresponding to a neuron; 

a learning processing section (60) which executes a learning process using a back-propagation learning 
algorithm, this process consisting in sequentially modifying, from the output layer towards the input layer, 
the coupling coefficients W_ji of all units j of the intermediate layer and of the output layer by means of a 
variation coefficient ΔW_ji so as to minimize the total sum of the square errors between the actual output 
value O_pj of the unit j of the output layer (L_O) produced from an input signal pattern (p) and the desirable 
output value t_pj (teacher signal) for said unit j of the output layer (L_O), whereby W_ji is the weight of the 
signal from the i-th to the j-th unit, 

the learning processing section receiving a desired output value t_pj as a teacher signal for the output 
value O_pj of the unit j of the output layer (L_O), for the input patterns p entered into the input layer (L_I), 

the learning processing section (60) computing the error value for each unit of the output layer and of the 
intermediate layer, 

said learning process being executed repeatedly until the total sum (E) of the square errors between the 
desired output signal, which is supplied as the teacher signal, and the output signal has become sufficiently 
small, 

characterised in that the learning processing section (60) comprises control means for increasing the number 
of units of the intermediate layer (L_H), during the repeated execution of said learning process, either 
periodically or when a local minimum state of the signal processing system has been detected, said learning 
processing section (60) being designed, in subsequent repeated executions of the learning process, to perform 
the learning processing of the coefficients W_ji of coupling strength with respect to the increased number of 
units of the intermediate layer. 

2. The learning processing system according to claim 1, wherein the control means of the learning processing 
section (60) is designed to detect a local minimum state by comparing successive values of a first 
variable, LMS, where: 

$$\mathrm{LMS} = \sum_{p=1} \sum_{j=1} (t_{pj} - O_{pj})^2$$